Abstract: An outgroup roots a network to form a tree and/or to infer hypothetical ancestral character states.
Usually, multiple taxa of a closely related sister group of the ingroup are selected. To empirically evaluate 
the choice of outgroup, we implemented three strategies of outgroup selection: a single taxon from the sister 
group, multiple taxa within the sister group, and multiple taxa from successive sister groups. Subsequently,
we evaluated their effects on tree topologies within the family Halictidae (Hymenoptera: Apoidea)
incorporating three tree reconstruction methods: maximum likelihood, maximum parsimony and Bayesian 
inference. The use of multiple taxa within the sister group produced more consistent results than the other 
two outgroup strategies. The tree topologies were generally consistent with the putative tree topology of 
Halictidae. Compared with the other two tree reconstruction methods, maximum parsimony produced more 
consistent results with different outgroup strategies, yet often obtained less resolution. 

1 INTRODUCTION


1.1 Importance of outgroup selection 


As rigorous methods of tree reconstruction
developed, they were widely applied in phylogenetics,
molecular ecology, bioconservation, disease control,
DNA barcoding and other fields (Cavalli-Sforza and
Edwards, 1967; Phillips et al., 2000; Sanderson and
Shaffer, 2002). Phylogenies are hypotheses of
relationships. The strength of the hypotheses depends
on the assumptions of each method of phylogenetic
inference. Various assumptions and factors exert effects
on tree topologies whether the target organisms include
prokaryotes or eukaryotes. For DNA sequence data,
the factors can include taxon sampling, alignment
methodology, gene selection, gene sequence length,
data treatment, optimality criterion, parameter values
and others (Smith, 1994; Dalevi et al., 2001;
Cameron et al., 2004; Ware et al., 2008). Among
these explicit and implicit factors, outgroup selection is
usually inadequately considered. Apparently, outgroup
selection is often arbitrary or based on obscure
relationships between outgroup and ingroup taxa
(Lyons-Weiler et al., 1998; Sanderson and Shaffer,
2002; Cameron et al., 2004).

Traditionally, an outgroup serves to root unrooted
networks and/or to infer hypothetical ancestral states
(Watrous and Wheeler, 1981; Maddison et al., 1984;
Wheeler, 1990; Smith, 1994; Lyons-Weiler et al.,
1998; Sanderson and Shaffer, 2002). The ingroup
should not be studied in isolation. Outgroup selection
is critical because the topology of the ingroup tree can vary
with the choice of outgroup taxa (Milinkovitch and
Lyons-Weiler, 1998; Tarrio et al., 2000; Cameron et
al., 2004; Ware et al., 2008). For example, using
Mollusca + Annelida as an outgroup of
Arthropoda, Nardi et al. (2003) challenged the
traditional concept of Hexapoda forming a monophyletic
clade. However, when different taxa were used as the
outgroup, the phylogeny of Nardi et al. (2003) was
not consistently obtained (Cameron et al., 2004). In
addition, the research of Ware et al. (2008) on the
relationships of Dictyoptera showed that outgroup
selection influenced tree topologies to the extent that
unexpected placements were obtained for some taxa.
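
The rooting role of an outgroup can be illustrated with a minimal Python sketch. It assumes a hypothetical unrooted network given as a list of edges, with invented leaf names (A, B, C, D, X) and internal nodes (n1, n2, n3); none of these come from the datasets of this study. Rooting the same network on a different leaf gives a different reading of the relationships among the remaining taxa.

```python
from collections import defaultdict

def root_on_outgroup(edges, outgroup):
    """Root an unrooted tree on the branch leading to one outgroup leaf.

    edges: list of (node, node) pairs describing the unrooted tree.
    Returns a nested tuple (outgroup, rest_of_tree).
    """
    neighbours = defaultdict(list)
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    def subtree(node, parent):
        # Walk away from the root; nodes without further children are leaves.
        children = [n for n in neighbours[node] if n != parent]
        if not children:
            return node
        return tuple(subtree(child, node) for child in children)

    (attach,) = neighbours[outgroup]  # a leaf has exactly one neighbour
    return (outgroup, subtree(attach, outgroup))

# Hypothetical unrooted network (labels are illustrative only)
edges = [("A", "n1"), ("B", "n1"), ("n1", "n2"), ("C", "n2"),
         ("n2", "n3"), ("D", "n3"), ("X", "n3")]

rooted_on_x = root_on_outgroup(edges, "X")
rooted_on_a = root_on_outgroup(edges, "A")
print(rooted_on_x)
print(rooted_on_a)
```

With X as the outgroup, A and B form the most nested clade; with A as the outgroup, the same network reads as a ladder in which D and X are most nested.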

1.2 Strategies of outgroup selection

Outgroup selection usually conforms to several
principles. Whenever possible, the outgroup should be
outside of, but closely related to, the ingroup, and
preferably the sister group of the ingroup, and the
outgroup should contain more than one taxon; these
guidelines help to avoid erroneous phylogenetic signals
that can result in a “random outgroup effect” and the long
branch attraction (LBA) sometimes associated with
root placement (Watrous and Wheeler, 1981;
Maddison et al., 1984; Wheeler, 1990; Tarrio et al.,
2000; Dalevi et al., 2001; Graham et al., 2002;
Sanderson and Shaffer, 2002; Cameron et al., 2004;
Bergsten, 2005). Nevertheless, the use of a closely
related sister group may not always lead to the correct
hypothesis of phylogenetic relationships, especially
when evolutionary rates are relatively high
(Lyons-Weiler et al., 1998; Sanderson and Shaffer, 2002),
or there is only a single taxon in the sister group
(Smith, 1994; Sanderson and Shaffer, 2002).

Among the previous studies investigating outgroup
selection for tree reconstruction, Smith (1994)
proposed three general strategies of outgroup selection
specifically for rooting molecular trees: a single taxon
outgroup, multiple taxa within a single sister group,
and single taxa from successive sister groups. He
pointed out that, compared with the other two
strategies, the option of multiple taxa within a single
sister group was theoretically better based on his
viewpoints of tree balance and homoplasy. This
suggestion was corroborated by his empirical test of the
echinoids (the ingroup). However, this test now
seems unconvincing because of insufficient data
(six taxa) and several methodological concerns (e.g.,
no statistical test for nodal strength, and confounding
factors affecting tree topologies). Our study followed
these three strategies while investigating phylogenetic
relationships within the family Halictidae. Because a
relative rate speedup is unlikely to be encountered in the
sister group (Sanderson and Shaffer, 2002), we
assumed that none had occurred.

1.3 Phylogenetic relationships of the Halictidae

The monophyletic family Halictidae
(Hymenoptera: Apoidea) is a group of short-tongued
(S-T) bees. It is the second largest family in Apoidea
with more than 4 150 described species
(http://pick14.pick.uga.edu/mp/20q?guide=Apoidea_species),
some of which are among the commonest
bees (Packer and Taylor, 1997; Alexander and
Michener, 1995; Michener, 2000). Other than the
genus Apis (Apidae), halictids outnumber other bees in
individuals in many temperate areas
(Michener, 2000). The great diversity in social
behavior has made them a model group for studying
social evolution (Danforth et al., 2008).

Generally, Halictidae contains four monophyletic
subfamilies: Halictinae, Rophitinae, Nomiinae and
Nomioidinae. Although this subfamily classification is
not universally accepted, their phylogenetic
relationships are supported by both morphological and
molecular evidence (Pesenko, 1999; Michener,
2000; Danforth et al., 2004, 2008). The generally
accepted subfamily level relationships are as follows:
(Rophitinae (Nomiinae (Nomioidinae +
Halictinae))). At the family level, Halictidae,
Colletidae and Andrenidae of S-T bees are putatively
monophyletic, but Stenotritidae is resolved either as
branching off within Colletidae or as its sister group
(Michener, 2000; Danforth et al., 2006a, 2006b).
The phylogenetic relationships among them are as
follows: (Andrenidae (Halictidae (Colletidae +
Stenotritidae))) (Danforth et al., 2006a, 2006b). In
the long-tongued bees composed of the Megachilidae
and the Apidae, the Megachilidae is monophyletic and
falls outside of the family Halictidae (Danforth et al.,
2006a, 2006b).
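
The parenthetical notation used above can be handled programmatically. The following Python sketch encodes the two stated topologies as nested tuples and checks monophyly and sister-group relationships; the helper names (`leaves`, `is_monophyletic`, `sister_of`) are hypothetical and the nested-tuple encoding is an assumption made for illustration, not part of the original analysis.

```python
def leaves(tree):
    """Return the set of leaf names under a nested-tuple rooted tree."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result

def is_monophyletic(tree, group):
    """True if some clade of the tree contains exactly the taxa in group."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)

def sister_of(tree, group):
    """Leaf set of the sister clade of group, or None if group is not a clade."""
    group = set(group)
    if isinstance(tree, str):
        return None
    for child in tree:
        if leaves(child) == group:
            return leaves(tree) - group
    for child in tree:
        found = sister_of(child, group)
        if found is not None:
            return found
    return None

# Topologies as stated in the text
family_tree = ("Andrenidae", ("Halictidae", ("Colletidae", "Stenotritidae")))
subfamily_tree = ("Rophitinae", ("Nomiinae", ("Nomioidinae", "Halictinae")))

print(sister_of(family_tree, {"Halictidae"}))
print(is_monophyletic(subfamily_tree, {"Nomioidinae", "Halictinae"}))
```

Under the family-level topology, the sister group of Halictidae is Colletidae + Stenotritidae, which is the group from which the first two outgroup strategies draw their taxa.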

Herein, we investigate the effect of outgroup
selection on the construction of trees using a molecular
phylogenetic analysis of Halictidae. Our outgroup
selection strategies involve the sister groups of
Halictidae as follows: Strategy I, a single taxon in
the sister group, but one that differs from Smith’s
(1994) single taxon outgroup; Strategy II, multiple
taxa within the sister group, which is similar to Smith’s
grouping; Strategy III, multiple taxa from successive
sister groups, different from his use of a single taxon
from successive sister groups. Tree topologies derived
from datasets with different strategies of outgroup
selection would serve to evaluate the effects of outgroup
selection.


2 METHODS AND MATERIALS 


2.1 Gene marker 

The gene 28S rDNA D2-D3 was used as the 
molecular marker because of its large proportion of 
potentially phylogenetically informative characters 
(Hancock and Dover, 1988; Hancock et al., 1988; 
Tautz et al., 1988) and widespread implementation to 
infer higher level relationships (De Rijk et al., 1995; 
Schnare et al., 1996; Danforth et al., 2006a, 
2006b). A moderate number of D2-D3 sequences from 
the family Halictidae and related taxa were taken from 
GenBank (http://www.ncbi.nlm.nih.gov/).

2.2 Taxa and datasets

Table 1 lists the taxa and GenBank accession
numbers. For the family Halictidae, all 24 D2-D3
sequences available in GenBank (as of 27 June 2008)
were used as the ingroup. This sampling included its
four subfamilies: Halictinae (6), Nomiinae (8),
Nomioidinae (1) and Rophitinae (9). The outgroup
included four taxa representing three of the four
subfamilies of Andrenidae, seven taxa representing
seven subfamilies of Colletidae, and three taxa
representing two subfamilies of Megachilidae. These
outgroup taxa were all chosen randomly to represent
their taxonomic group. In addition, only one
taxon was available from GenBank to represent the family
Stenotritidae. All apoids other than halictids can
be used as potential outgroup members to Halictidae.


Table 1 Taxonomic diversity and nucleotide sequences obtained from GenBank used to investigate the effect of outgroup selection

IG denotes ingroup taxa, while OG denotes outgroup taxa.

Families       Subfamilies     Species                               GenBank accession no.  Taxon no.
Halictidae     Halictinae      Agapostemon tyleri                    AY654506               IG 1
               Halictinae      Augochlorella pomoniella              AY654507               IG 2
               Halictinae      Halictus rubicundus                   AY654510               IG 3
               Halictinae      Sphecodes pecosensis                  DQ072154               IG 4
               Halictinae      Sphecodes sp. Spsp1055                DQ072155               IG 5
               Halictinae      Patellapis (Zonalictus) sp. BND-2006  DQ060870               IG 6
               Nomiinae        Dieunomia heteropoda                  DQ072151               IG 7
               Nomiinae        Dieunomia nevadensis                  AY654512               IG 8
               Nomiinae        Dieunomia nevadensis                  DQ060852               IG 9
               Nomiinae        Lipotriches patellifera               DQ072146               IG 10
               Nomiinae        Macronomia aureozonata                DQ072149               IG 11
               Nomiinae        Nomia tetrazonata                     DQ072152               IG 12
               Nomiinae        Pseudapis obesula                     DQ060868               IG 13
               Nomiinae        Pseudapis unidentata                  AY654514               IG 14
               Nomioidinae     Nomioides facilis                     AY654511               IG 15
               Rophitinae      Conanthalictus conanthi               DQ072144               IG 16
               Rophitinae      Conanthalictus wilmattae              AY654508               IG 17
               Rophitinae      Dufourea mulleri                      AY654509               IG 18
               Rophitinae      Penapis penai                         AY654513               IG 19
               Rophitinae      Rophites algirus                      AY654515               IG 20
               Rophitinae      Rophites algirus                      DQ072159               IG 21
               Rophitinae      Systropha curvicornis                 AY654516               IG 22
               Rophitinae      Systropha glabriventris               DQ072156               IG 23
               Rophitinae      Xeralictus bicuspidariae              AY654517               IG 24
Andrenidae     Andreninae      Andrena nasonii                       DQ060849               OG 25
               Oxaeinae        Protoxaea gloriosa                    AY654480               OG 26
               Panurginae      Calliopsis subalpinus                 DQ060850               OG 27
               Panurginae      Protandrena nanulus                   DQ060857               OG 28
Colletidae     Colletinae      Colletes graeffei                     EF363690               OG 29
               Diphaglossinae  Caupolicana vestita                   AY654486               OG 30
               Euryglossinae   Euryglossina globuliceps              AY654490               OG 31
               Hylaeinae       Hylaeus proximus                      AY654493               OG 32
               Paracolletinae  Leioproctus irroratus                 AY654495               OG 33
               Scraptrinae     Scrapter niger                        AY654501               OG 34
               Xeromelissinae  Chilimelissa rozeni                   AY654481               OG 35
Megachilidae   Fideliinae      Fidelia major                         AY654539               OG 36
               Megachilinae    Lithurgus apicalis                    DQ072145               OG 37
               Megachilinae    Megachile pugnata                     AY654543               OG 38
Stenotritidae  -               Stenotritus sp.                       AY654503               OG 39

Four datasets were generated using the three strategies
of outgroup selection (Table 2). Considering the family
level relationships within the superfamily Apoidea,
Stenotritidae (dataset I, 25 taxa) was selected to be the
outgroup of Strategy I. Strategy II used Colletidae +
Stenotritidae (dataset II, 32 taxa) as the outgroup. Strategy
III contained Andrenidae + Colletidae + Stenotritidae
(dataset III-a, 36 taxa) and Andrenidae + Colletidae +
Megachilidae + Stenotritidae (dataset III-b, 39 taxa) as
the outgroup. We also used seven extended datasets, in
each of which one species from the seven taxa (as above)
of Colletidae was used as the outgroup.


Table 2 Four standard datasets and extended datasets in the study of the effect of outgroup selection

Standard datasets
Code    I            II             III-a           III-b
Taxa    IG + (39)    I + (29-35)    II + (25-28)    III-a + (36-38)

Extended datasets
Code    I-a          I-b            I-c          ...    I-g
Taxa    IG + (29)    IG + (30)      IG + (31)    ...    IG + (35)

I, II, and III correspond to the three strategies of outgroup selection. Numbers in parentheses correspond to outgroup taxon numbers in Table 1.
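
The nesting of the datasets can be checked with a short Python sketch that builds each dataset as a set of taxon numbers from Table 1. The variable names are hypothetical, and the mapping of extended datasets I-a to I-g onto taxa 29 to 35 in ascending order is an assumption based on the entries shown in Table 2.

```python
# Taxon numbers as listed in Table 1
INGROUP = set(range(1, 25))        # IG 1-24, Halictidae
ANDRENIDAE = set(range(25, 29))    # OG 25-28
COLLETIDAE = set(range(29, 36))    # OG 29-35
MEGACHILIDAE = set(range(36, 39))  # OG 36-38
STENOTRITIDAE = {39}               # OG 39

# Standard datasets: each one extends the previous one
datasets = {}
datasets["I"] = INGROUP | STENOTRITIDAE
datasets["II"] = datasets["I"] | COLLETIDAE
datasets["III-a"] = datasets["II"] | ANDRENIDAE
datasets["III-b"] = datasets["III-a"] | MEGACHILIDAE

# Extended datasets: ingroup plus a single Colletidae taxon
# (assumed order: I-a -> 29, I-b -> 30, ..., I-g -> 35)
for letter, taxon in zip("abcdefg", sorted(COLLETIDAE)):
    datasets[f"I-{letter}"] = INGROUP | {taxon}

sizes = {code: len(taxa) for code, taxa in datasets.items()}
print(sizes)
```

The resulting sizes of the standard datasets (25, 32, 36 and 39 taxa) agree with the counts given in the text.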


2.3 Sequence alignments 

All sequences (Table 1) were prepared for 
multiple sequence alignment. To avoid an alignment 
being dependent on the removed taxon sequence 
(Cameron et al., 2004), 24 sequences of the 
Halictidae in combination with sequences of outgroup 
taxa/taxon were independently aligned using ClustalW 
(Thompson et al., 1994) with default parameters. A 
final adjustment was made by eye. The end regions 
were truncated to ensure that only the D2-D3 region 
was used, with reference to the 28S rDNA sequence 
from the honey bee, Apis mellifera (Gillespie et al.,
2006). Alignment statistics were evaluated using
PAUP* v.4.0b10 (Swofford, 2002). 
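
The alignment statistics referred to here (constant, variable but uninformative, and parsimony-informative sites) follow standard definitions, which the Python sketch below applies to a toy alignment. It is not a reimplementation of PAUP*; treating gaps and ambiguous bases as missing data is an assumption, and the function names and toy sequences are invented for illustration.

```python
from collections import Counter

MISSING = set("-?N")  # assumption: gaps and ambiguous bases count as missing

def classify_site(column):
    """Classify one alignment column by its observed character states."""
    counts = Counter(base for base in column.upper() if base not in MISSING)
    if len(counts) <= 1:
        return "constant"
    # Parsimony-informative: at least two states, each in at least two taxa
    shared = [state for state, n in counts.items() if n >= 2]
    return "informative" if len(shared) >= 2 else "uninformative"

def alignment_statistics(sequences):
    """Count site classes over a list of equal-length aligned sequences."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    columns = ("".join(seq[i] for seq in sequences) for i in range(length))
    return Counter(classify_site(column) for column in columns)

# Toy alignment of four taxa and five sites (not real data)
toy = ["ACGTA", "ACGCA", "ACTCA", "AGTTA"]
stats = alignment_statistics(toy)
print(stats)
```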

2.4 Tree reconstruction 

Phylogenetic analysis was performed with both
PAUP* v.4.0b10 (Swofford, 2002) and MrBayes
v.3.1.2 (Huelsenbeck and Ronquist, 2001) for each of
the standard datasets (I, II, III-a, III-b). For the
extended datasets, Bayesian inference was not feasible
because of long computation times.

Maximum likelihood Modeltest v.3.7 (Posada
and Crandall, 1998; Posada and Buckley, 2004) was
first employed to select the appropriate DNA
substitution model for our aligned NEXUS files. After
the favored model (TVM + I + G) and certain
parameter values were appended as a block to the original
NEXUS files, the analyses were performed in PAUP*
v.4.0b10 (Swofford, 2002). The datasets were analyzed
using heuristic searches under the likelihood criterion
with 100 random addition sequence replications and
tree bisection reconnection (TBR) branch swapping.
We generally obtained one maximum likelihood (ML) tree
for each dataset.

Maximum parsimony Trees were generated by
PAUP* v.4.0b10 (Swofford, 2002). The heuristic
search involved 2 000 random addition sequence
replications under tree bisection reconnection (TBR)
branch swapping. A 50% majority rule consensus tree
was used for comparison (MP tree).
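
The idea behind a 50% majority rule consensus can be sketched in Python as follows: a clade is retained if it appears in more than half of the input trees. The trees below are hypothetical nested tuples with invented taxon labels, and the sketch only lists the retained clades rather than assembling a consensus tree as PAUP* does.

```python
from collections import Counter

def clades(tree):
    """Return (leaf set, set of clades) for a nested-tuple rooted tree."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    members = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        members |= child_leaves
        found |= child_clades
    found.add(members)  # the node itself defines a clade
    return members, found

def majority_rule_clades(trees, threshold=0.5):
    """Clades present in more than `threshold` of the trees, with frequencies."""
    counts = Counter()
    for tree in trees:
        counts.update(clades(tree)[1])
    return {clade: n / len(trees)
            for clade, n in counts.items() if n / len(trees) > threshold}

# Three hypothetical equally parsimonious trees (labels are illustrative)
trees = [
    ((("A", "B"), "C"), "D"),
    ((("A", "B"), "D"), "C"),
    (("A", "B"), ("C", "D")),
]
consensus = majority_rule_clades(trees)
print(consensus)
```

Only the clade A + B (and the trivial clade of all taxa) passes the threshold, so the consensus is less resolved than any single input tree. This is one way a consensus-based MP tree can show lower resolution.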

Bayesian inference First, MrModeltest v.2.2
(Nylander, 2004; Posada and Buckley, 2004) was
employed to select the appropriate DNA
substitution model for Bayesian inference (BI). Two
independent Markov Chain Monte Carlo (MCMC)
analyses of 10 million iterations were performed in
MrBayes v.3.1.2, each with four chains (three hot, one
cold), sampling one tree per 100 iterations
(Huelsenbeck and Ronquist, 2001). The “sump”
command together with “burnin = 25 000” (25% of the
samples) was used to determine the appropriate
“burnin”. If the summarized parameters showed that
the potential scale reduction factor (PSRF) was
reasonably close to 1.0, “sumt burnin = 25 000” was
then used to pool trees after discarding the “burnin”.
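
The burn-in arithmetic and the convergence diagnostic can be sketched in Python. The run length and sampling interval are taken from the text; the `psrf` function implements the textbook Gelman-Rubin form for a single parameter, which is an assumption and may differ in detail from the value MrBayes reports. The toy chains are invented.

```python
from statistics import mean, variance

generations = 10_000_000              # iterations per run (from the text)
sample_every = 100                    # one tree sampled per 100 iterations
samples = generations // sample_every # sampled trees per run
burnin = samples // 4                 # first 25% of the samples discarded

def psrf(chains):
    """Gelman-Rubin potential scale reduction factor for one parameter.

    chains: list of equal-length lists of post-burn-in samples, one per run.
    """
    n = len(chains[0])
    within = mean(variance(chain) for chain in chains)            # W
    between_over_n = variance([mean(chain) for chain in chains])  # B / n
    pooled = (n - 1) / n * within + between_over_n
    return (pooled / within) ** 0.5

# Toy chains: the first pair agrees, the second pair has not converged
agreeing = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
diverging = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]
print(samples, burnin, psrf(agreeing), psrf(diverging))
```

Values well above 1.0, as for the diverging pair, would indicate that the independent runs have not yet sampled the same distribution.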

2.5 Statistical analysis

Nonparametric bootstrapping was employed to
estimate branch support of the MP and the ML trees. For
the MP trees, we used 1 000 replications. For the ML
trees, because of the long computation times required, we
used 500 replications. Bayesian posterior probabilities were used
to estimate the reliability of the BI tree topologies.
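
Nonparametric bootstrapping resamples alignment columns with replacement and repeats the tree search on each pseudo-replicate. The Python sketch below shows only the resampling step and the final support percentage; the tree search itself is omitted, and the function names, seed and toy alignment are hypothetical.

```python
import random

def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one pseudo-replicate).

    alignment: dict mapping taxon name to aligned sequence (equal lengths).
    """
    length = len(next(iter(alignment.values())))
    picks = [rng.randrange(length) for _ in range(length)]
    return {taxon: "".join(seq[i] for i in picks)
            for taxon, seq in alignment.items()}

def bootstrap_support(replicate_has_clade):
    """Percentage of replicates in which a given clade was recovered."""
    return 100.0 * sum(replicate_has_clade) / len(replicate_has_clade)

rng = random.Random(2008)  # fixed seed so the sketch is repeatable
toy = {"taxon1": "ACGTAC", "taxon2": "ACGTTC", "taxon3": "AGGTTC"}
replicates = [bootstrap_replicate(toy, rng) for _ in range(1000)]
print(replicates[0])
```

Each replicate keeps the original number of sites and taxa, so the same search settings can be applied to it as to the original alignment.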


3 RESULTS 


3.1 Sequence alignments 

The gene sequences of 28S rDNA D2-D3
contained 656 nucleotide sites, corresponding to the
28S sequence of the honey bee, Apis mellifera, from
the end of Helix 531 to the beginning of Helix 589’.
After alignment by ClustalW, both the total number of
sites and the number of potentially parsimony
informative sites differed among the four standard datasets
(Table 3).


Table 3 Alignment statistics for four standard datasets

Datasets   Constant sites   Variable but uninformative sites   Parsimony-informative sites   Total sites
I          443 (62.75%)     103 (14.59%)                       160 (22.66%)                  706
II         406 (56.31%)     109 (15.12%)                       206 (28.57%)                  721
III-a      368 (51.18%)     114 (15.86%)                       237 (32.96%)                  719
III-b      350 (48.21%)     108 (14.88%)                       268 (36.91%)                  726
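
The internal consistency of Table 3 can be checked with a few lines of Python: the three site classes should sum to the total, and the percentages follow from the counts. The counts are those reported in the table; the variable and function names are hypothetical.

```python
# (constant, variable but uninformative, parsimony-informative, total sites)
table3 = {
    "I":     (443, 103, 160, 706),
    "II":    (406, 109, 206, 721),
    "III-a": (368, 114, 237, 719),
    "III-b": (350, 108, 268, 726),
}

def percentages(constant, uninformative, informative, total):
    """Percentage of each site class, after checking that the counts add up."""
    if constant + uninformative + informative != total:
        raise ValueError("site classes do not sum to the total")
    return tuple(round(100 * n / total, 2)
                 for n in (constant, uninformative, informative))

summary = {code: percentages(*row) for code, row in table3.items()}
for code, values in summary.items():
    print(code, values)
```

The share of parsimony-informative sites rises from dataset I to dataset III-b as more distantly related outgroup taxa are added.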

3.2 Tree topologies

Dataset I Monophyly of the subfamilies, except
Nomioidinae, was supported by ML and BI analyses
(Fig. 1). With MP, Augochlorella pomoniella
clustered with Nomiinae rather than Halictinae; this
arrangement did not support monophyly of either
Halictinae or Nomiinae (Table 4).

Table 4 Summary of tree topologies for four standard datasets from three tree reconstruction methods

Dataset + Method   Nomiinae   Halictinae   Rophitinae   Nomiinae + Halictinae   Halictidae
I-ML               √          √            √            √                       √
I-MP               ×          ×            √            √                       √
I-BI               √          √            √            √                       √
II-ML              √          √            √            √                       √
II-MP              ×          ×            √            √                       √
II-BI              √          √            √            √                       √
III-a-ML           √          ×            √            √                       ××
III-a-MP           √          ×            √            √                       ××
III-a-BI           √          ×            √            √                       ××
III-b-ML           √          √            √            √                       √
III-b-MP           √          √            √            √                       ××
III-b-BI           √          ×            √            √                       √

ML: Maximum likelihood; MP: Maximum parsimony; BI: Bayesian inference. “√” denotes that the putative monophyly of a certain taxonomic group is
corroborated by our results; “××” denotes that the monophyly is not obtained in our results due to the unexpected positions of some taxa; “×” denotes
the uncorroborated monophyly of Nomiinae and Halictinae due to the position of Augochlorella pomoniella. The position of Nomioides facilis in tree
topologies was neglected in this summary.


For the extended datasets, only the MP and ML trees
were evaluated. When Colletes graeffei was
used as the outgroup, tree topologies were similar to
those of dataset I. Using Hylaeus proximus as the
outgroup, unresolved internal phylogenetic
relationships were obtained within Halictidae. In the
other trials, tree topologies were generally consistent
with the accepted taxonomy except for the phylogenetic
relationships of a few taxa (e.g., Nomioides facilis and
Augochlorella pomoniella).

Dataset II The ML and BI trees (Fig. 2)
resolved the monophyly of three subfamilies and their
current phylogenetic relationships. Again, the
monophyly of Halictinae was not supported by the MP tree
because of the position of Augochlorella pomoniella.

Fig. 1 Tree topology for halictid bees derived from maximum likelihood and Bayesian inference methods
using DNA sequence data in the standard dataset I 
Node labels are given as bootstrap values/Bayesian posterior probabilities. Values contrary to the majority rule are not shown. Blue background represents 
the taxa in Nomiinae; red, the Halictinae; green, the Rophitinae; gray, the outgroup; and one yellow leaf, the Nomioidinae. 

Fig. 2 Tree topology for halictid bees derived from maximum likelihood and Bayesian inference methods
using DNA sequence data in standard dataset II
Node labels are given as bootstrap values/Bayesian posterior probabilities. Values contrary to the majority rule are not shown. Blue background represents
the taxa in Nomiinae; red, Halictinae; green, Rophitinae; gray, the outgroup; and one yellow leaf, Nomioidinae. 


Dataset III-a When four representatives of
Andrenidae were added to the outgroup (dataset III-a),
the monophyly of the family Halictidae was not resolved
in any of the three analyses, although the monophyly of
Rophitinae was always supported. In the ML and BI
trees, four representatives of Andrenidae, part of the
outgroup, clustered with Nomiinae + Halictinae +
Nomioides facilis as their sister group (Fig. 3).
However, the MP tree united the representatives of
Andrenidae with the subfamily Rophitinae in a
paraphyletic clade at the base of Nomiinae + Halictinae
+ Nomioides facilis. Another common characteristic of
these trees was that Augochlorella pomoniella
clustered at the base of Nomiinae + Nomioides facilis.

Fig. 3 Tree topology for halictid bees derived from maximum likelihood and Bayesian inference methods
using DNA sequence data in standard dataset III-a
Node labels are given as bootstrap values/Bayesian posterior probabilities. Values contrary to the majority rule are not shown. Blue background represents
the taxa in Nomiinae; red, the Halictinae; green, the Rophitinae; gray, the outgroup; and one yellow leaf, the Nomioidinae.

Dataset III-b When the three representatives of
Megachilidae were added to the outgroup (dataset III-b),
the monophyly of the family Halictidae was
recovered (Fig. 4; Fig. 5) in all trees except for the MP
tree, in which four representatives of Andrenidae
together with the three samples of Megachilidae
clustered as the sister group of Nomiinae + Halictinae
+ Nomioides facilis.

Fig. 4 Tree topology for halictid bees derived from maximum likelihood using DNA sequence data in standard dataset III-b
Node labels are bootstrap values. Blue background represents the taxa in Nomiinae; red, the Halictinae; green, the Rophitinae; gray, the outgroup; and
one yellow leaf, the Nomioidinae.

Fig. 5 Tree topology for halictid bees derived from Bayesian inference using DNA sequence data in standard dataset III-b
Node labels are the Bayesian posterior probabilities. Blue background represents the taxa in Nomiinae; red, the Halictinae; green, the Rophitinae; gray,
the outgroup; and one yellow leaf, the Nomioidinae.

4 DISCUSSION


We selected the family Halictidae as the ingroup
to evaluate the effect of outgroup selection on tree
reconstruction. One gene marker (28S rDNA D2-D3)
was employed because of its established phylogenetic
signal and resolving power within higher
levels of Halictidae. We sought to remove many factors
that could affect tree topology by controlling variables
such as sequence region and number of tree
generations. Differences in tree topologies
should therefore have reflected differences in outgroup
composition and method of tree reconstruction.
Nevertheless, the effect of the particular gene marker (i.e.,
28S rDNA D2-D3 in our case) cannot be removed,
and further research on other genes is still needed.

Our evaluation assumed that the putative topology
of Halictidae by Danforth et al. was correct because of
sufficient taxon sampling, the incorporation of three
gene markers and the application of powerful methods
of analysis (Danforth et al., 2004, 2008). Nomioides
facilis was seldom considered in the assessment of
accuracy because its usual grouping with the genus
Dieunomia may have been due to long branch attraction
(Figs. 1-3) and/or inadequate taxon sampling.
Unfortunately, only one sequence was available for
Nomioidinae. Without considering Nomioides
facilis, some of the trees we obtained supported the putative
phylogeny, but others did not.



Our analysis demonstrated that multiple taxa
within the nearest sister group should be used when
forming outgroups. Trees rooted with a greater diversity
of immediate sister group species had topologies most
congruent with the putative phylogeny. This conclusion
is tempered by the discovery that multiple taxa from
successive sister groups resulted in very unstable tree
topologies, even when multiple taxa of the sister group
of Halictidae were included in the outgroup. The effect
was so great that the more distant sister group
(Andrenidae) clustered with one subclade of the
ingroup, and thus greatly rearranged the ingroup tree
topology. When we adopted Strategy III for outgroup
formation, we obtained an untenable topology in the
ingroup. Conversely, application of Strategy I to the
standard dataset together with the extended datasets
revealed that a single taxon in the sister group could
represent the outgroup, provided that the selected taxon accurately
polarized characters in the ingroup, and depending on
the method of tree reconstruction. This discovery
suggested that phylogenies may be consistently
reconstructed even when the outgroup contains few
taxa, as long as the nearest sister group taxon is used.



Synapomorphies are the fundamental basis of
phylogeny reconstruction. Unlike many classes of
morphological characters, nucleotide sequences
have only five possible character states, including indels.
Outgroup choice should be critical because homoplasy
owing to parallel change could be a troublesome factor
in tree reconstruction. In our case, Andrenidae was
used as the most distantly related outgroup member to
Halictidae. This arrangement effectively added
outgroup taxa that fell outside of the
ingroup-plus-first-outgroup (Sanderson and Shaffer, 2002). Did parallel
nucleotide substitutions result in homoplasy at some
nucleotide positions to the extent of causing changes in
the tree? When Andrenidae was used as the outgroup
in dataset III-a, key nucleotide sites had homoplastic
signals that broke the monophyly of Halictidae.
However, when three taxa in Megachilidae were
added, much of this confusion was eliminated because
the Megachilidae taxa prevented homoplastic signals from
influencing the tree reconstruction; the monophyly of
Halictidae was recovered again. For single taxon
outgroups, even from the nearest sister group, homoplasy
should exist (Smith, 1994), but the extent of
homoplasy and the attainment of false phylogenetic
relationships depend on the taxon itself. Depending on
the outgroup taxon, some tree topologies were
inconsistent with the putative tree (e.g., extended
dataset I-e), yet others were largely congruent (e.g.,
dataset I). However, multiple taxa within the sister
group could buffer the homoplasy contributed by any single
taxon, and together they tend to yield the correct
phylogeny.

Three different methods of tree reconstruction were
employed to evaluate the consistency of tree topology
for each standard dataset. Although the optimality criteria
differ in their assumptions to the extent that some
have advantages (e.g., realism, generality, and
economy of assumptions) over others (Goloboff,
2003), MP seemed to consistently provide less
resolution than the ML and BI approaches. However, MP
produced more consistent results across
outgroup strategies. Thus, outgroup choice may be
particularly critical for model-based approaches to tree
reconstruction, in particular ML and BI, at least in
this case study. Further trials are required to determine
whether this finding holds generally.

Effect of outgroup selection on the phylogenetic reconstruction of Halictidae (Hymenoptera: Apoidea)


FR, KEA IRR, Wie Robert W. MURPHY’, RHR?” 
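(1. Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China; 2. Graduate University of Chinese Academy of Sciences, Beijing 100049, China;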


3. UCD Conway Institute of Biomolecular and Biomedical Sciences, University College Dublin, Dublin 4 Ireland; 
4. Department of Natural History, Royal Ontario Museum, 100 Queen’ s Park, Toronto, ON, M5S 2C6, Canada) 


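Abstract: An outgroup is used to root a tree and to infer ancestral character states. Usually, multiple taxa from the sister group of the ingroup are jointly selected as the outgroup.
To test this approach empirically, we adopted three strategies of outgroup selection: a single taxon in the sister group, multiple taxa within the sister group,
and multiple taxa from successive sister groups. Taking the phylogenetic reconstruction of Halictidae (Hymenoptera: Apoidea) as an example, we evaluated the effects of these three strategies on tree
topologies, including maximum likelihood, maximum parsimony and Bayesian trees. Preliminary results indicated that, compared with the other two strategies, using
multiple taxa within the sister group as the outgroup was more conducive to recovering the currently widely accepted phylogenetic relationships of Halictidae; compared with maximum likelihood and Bayesian
inference, maximum parsimony yielded relatively consistent topologies under different strategies of outgroup selection, although the phylogenetic relationships of Halictidae were not well
resolved.

Key words: Halictidae; homoplasy; monophyly; outgroup; sister group; phylogeny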